_______________________________________________________
D/Noise 1.0d
A Digital Audio Denoising Tool
_______________________________________________________
Windows 95 version

(C)  1996 Fast Mathematical Algorithms and Hardware Corporation, 
1020 Sherman Avenue, Hamden, CT 06514. 
<http://www.fmah.com>


_______________________________________________________
INTRODUCTION
  This demonstration version is meant to illustrate some of our current 
work in the area of audio signal processing. It is in no way suited for 
commercial denoising. This version will only operate on monophonic 
16-bit WAV (Waveform Audio File Format) files. In addition, the input 
file is limited to one million sample points.
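
  The restrictions above can be checked before opening a file. The 
following is a minimal sketch only, written in Python with its standard 
"wave" module; it is not part of D/Noise, and the names check_wav and 
MAX_SAMPLES are made up for this illustration.

```python
import wave

MAX_SAMPLES = 1_000_000  # input size limit stated above


def check_wav(path):
    """Return a list of reasons the file would be rejected (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("not monophonic")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bits
            problems.append("not 16-bit")
        if wav.getnframes() > MAX_SAMPLES:
            problems.append("more than one million sample points")
    return problems
```

  An empty list means the file meets all three stated restrictions.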

  This version of D/Noise does not support algorithm iteration, i.e.,
the denoising algorithm makes only a single pass through an audio file, 
separating it into what it thinks is coherent and what is noise. In the 
next release, you will be able to specify how many times the algorithm 
will pass through a file and its components to achieve a more thorough 
separation. In addition, you will be able to save a compressed version 
of the denoised file as well as apply some basic pre- and 
post-processing transforms.

_______________________________________________________
Installation
----------------
This distribution consists of 4 files which should stay together in one 
folder:

  (1) Preliminary documentation (this file)

  (2) Dnoise.exe, the shell used to run the algorithm
  
  (3) Denoise.dll, the denoising algorithm in a dynamic link library

  (4) Caruso.wav, a sample audio file containing a snippet of Enrico 
      Caruso’s singing, recorded in 1904


Opening a WAV file for denoising
-------------------------------------------------
    D/Noise performs a one-pass denoising procedure on an open WAV 
file. To open a file:
    [1] Select "Open..."  from the "File" menu.
    [2] Locate and open a file using the standard file dialog.

  You can run the denoising procedure on the entire file or just a short 
segment of it. To select a segment of your source file, click-drag across 
it with the mouse. The toolbar along the top of the main window has a 
couple of standard controls for scrolling the waveform representation 
of the file, as well as for zooming in and out.

  The smallest length of the signal you can select is determined by the 
control at the right end of the toolbar at the bottom of the main 
window. This length also determines the size of the sliding signal 
window used in the denoising procedure. A length of 1,024 sample 
points is usually adequate. [NB: the other controls in the bottom 
toolbar are not functional yet and appear disabled.] 


Setting denoising parameters
----------------------------------------
  The outcome of the denoising procedure depends on the settings of 
various parameters. The exact meaning of these parameters is 
explained at the end of this document.

  To open the denoising algorithm interface, select "Configure..." from 
the "Denoise" menu.

  You can select one of two default parameter sets or enter your own. 
To select a default set, click on the "Default 1" or "Default 2" button in 
the "Parameters" frame.

  You can also set your own parameter values. You can use the [tab] 
key to jump from one box to the next. You will get an error message if 
you try to enter a value outside the range of a specific parameter.


Running the denoising procedure
-----------------------------------------------
[1] Setting the output files
    The denoising process will leave your original input file untouched 
and generate two new files. The first of these two new files will 
contain the coherent ("clean") component of the source file and the 
second will contain the noisy component. In an extended procedure 
you could run the process on the noisy file again to extract even more 
coherent parts and add those to the first clean file. This version of 
D/Noise does not yet support this type of iteration (although you can 
do this "by hand"). In the next release, you will be able to specify a 
number of iterations for the algorithm.

  Use the Select... buttons to select names and locations for the two 
output files [Hint: if your files are fairly small and you have RAM to 
spare, you may want to put the output files on a RAM disk to speed up 
the process and minimize disk thrashing. You will need the same 
amount of storage for each of the coherent and noisy files as you need 
for your source file].

[2] Starting the procedure
    Click the [Denoise All] or the [Denoise Selection] button at the 
bottom of the dialog box. The procedure starts and progress 
information is displayed. You can abort the procedure at any time by 
clicking the [Stop] button. Note that it may take a little while before 
the algorithm stops, as event polling is kept at a minimum in order not 
to slow down the process. When finished, close the dialog box by 
clicking the [Done] button. You can now open and see/hear the 
resulting coherent and noise files.


_______________________________________________________
About the D/Noise Algorithm and its Control Parameters
    by Maxim J. Goldberg and Igor Popovic
_______________________________________________________

INTRODUCTION

  The D/Noise family of algorithms was developed for the purpose of 
removing noise from one dimensional signals, in particular, speech or 
music signals, by the method of denoising proposed by R. Coifman 
and V. Wickerhauser. One starts with a library of orthonormal 
waveforms, which typically includes wavelet packets and local 
trigonometric bases.  A signal is expanded in each basis, and a cost 
assigned to the expansion.  The basis giving rise to the least cost is 
chosen, the coefficients are ordered by magnitude, and a number of the 
leading terms is kept as the coherent part based on a predetermined 
threshold cost of the remaining terms. These leftover terms constitute 
by definition the noisy part of the signal, and can be treated as a new 
signal which can in turn be expanded and separated into its coherent 
and noisy components.

  In D/Noise, we use only one library of bases, those arising from the 
dyadic decomposition tree obtained by constructing local sines on the 
frequencies of a smoothly cut window from the signal.  A "best" basis 
is chosen by comparing the cost of a parent node to the sum of the 
costs of the 2 children. In D/Noise, the cost function can be chosen to 
be Shannon entropy or the lp norm of the coefficients of an expansion. We 
attempt to deal with numerical artifacts arising from the processing by 
(1) allowing shifts in time and frequency, and (2) by segmenting into 
large windows and only using the uncorrupted middle core. The large 
window we are using is 4 times the size of the core. For example, if 
the user selects a signal window of 1,024 samples, internally we slide 
and denoise a window of 4,096 samples and use only its 1,024-sample 
core in the reconstruction. This strategy has proven to give more 
pleasing results than any other "fancy" windowing.
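
  The parent-versus-children comparison described above can be sketched 
as a bottom-up pass over a dyadic tree of costs. This is an illustration 
in Python under stated assumptions, not the D/Noise code: the function 
name best_basis, the layout of the costs table, and the rule that ties 
keep the parent are all choices made for this example.

```python
def best_basis(costs):
    """Pick the cheapest set of nodes covering a dyadic tree.

    costs[level][i] is the cost of node i at that level; level 0 is the
    root and level j holds 2**j nodes. Returns (total_cost, chosen),
    where chosen is a list of (level, i) pairs.
    """
    depth = len(costs) - 1
    # At the deepest level every node is its own best choice.
    best = [(c, [(depth, i)]) for i, c in enumerate(costs[depth])]
    for level in range(depth - 1, -1, -1):
        merged = []
        for i, parent_cost in enumerate(costs[level]):
            (lcost, lnodes), (rcost, rnodes) = best[2 * i], best[2 * i + 1]
            if parent_cost <= lcost + rcost:
                # Assumption: on a tie the parent is kept.
                merged.append((parent_cost, [(level, i)]))
            else:
                merged.append((lcost + rcost, lnodes + rnodes))
        best = merged
    return best[0]
```

  For example, with costs = [[10.0], [3.0, 8.0], [2.0, 2.0, 1.0, 1.0]] 
the left half keeps its level-1 node (3.0 beats 2.0 + 2.0) while the 
right half is split into its two children (1.0 + 1.0 beats 8.0).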


PARAMETERS

(1) Window size
This parameter determines the number of consecutive samples 
processed at one time. Internally, the algorithm slides two "windows" 
of the selected width through the signal, offset by 1/2 their width. In 
addition, each window is extended to both its sides and only the core is 
used in the reconstruction after denoising. The windows should not be 
too narrow, since good frequency resolution is desirable, in particular 
for music. Nor should the windows be too wide, since information 
spread over time might mask local occurrences.  For music, 512, 1024, 
or perhaps 2048 are the sizes to consider first.
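
The relation between the selected window (the core) and the larger 
window that is actually denoised can be worked out as below. This 
Python sketch assumes the extension is split evenly between the two 
sides, which this document does not state explicitly; the name 
extended_window is hypothetical.

```python
def extended_window(core_start, core_size, factor=4):
    """Return (start, stop) of the large window around a core.

    Assumption: the extra samples are split evenly between both sides.
    stop is exclusive. A negative start means the signal would need
    padding at its beginning.
    """
    pad = (factor - 1) * core_size // 2  # samples added on each side
    return core_start - pad, core_start + core_size + pad
```

With a core of 1,024 samples starting at sample 2,048, this gives the 
range 512 to 4,608, i.e. a 4,096-sample window. A second window offset 
by 1/2 the width would have its core start at sample 2,560.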

(2) Log2 of reach
This is the log-base-2 of the size of the smallest interval to be 
considered by the local trigonometric transform decomposition tree. 
For example, the preset of 4 will give you 2^4=16 samples as the 
smallest interval to be considered.

(3) Energy threshold
This is the energy threshold for discarding coefficients from the 
extracted signal basis. This number, typically 0.0001 or 0.000001, 
means that in the chosen basis, those coefficients of size less than 
(energy threshold) * (energy of window segment) are set to zero and 
thus discarded.
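
The thresholding rule can be illustrated as follows. This is a sketch 
in Python, not the D/Noise code; it assumes that the "size" of a 
coefficient means its squared magnitude and that the energy of the 
window segment is the sum of squared coefficients. The function name 
is made up for this example.

```python
def apply_energy_threshold(coeffs, energy_threshold):
    """Zero every coefficient whose energy falls below the cutoff.

    Assumption: a coefficient's size is its square, and the segment
    energy is the sum of the squares of all coefficients.
    """
    energy = sum(c * c for c in coeffs)
    cutoff = energy_threshold * energy
    return [c if c * c >= cutoff else 0.0 for c in coeffs]
```

For coefficients 10.0, 1.0 and 0.01 with a threshold of 0.0001, the 
cutoff is about 0.0101, so only the last coefficient (energy 0.0001) 
is set to zero.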

(4) Entropy
    A real number alpha that determines which cost function is used 
to separate out the noise component: alpha = 0.0 selects Shannon 
entropy, while 0 < alpha < 1 selects the little-l-sub-p norm with 
p = 2*alpha. For example, entering 0.5 will result in the l1 norm 
being used.
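
    The mapping from alpha to a cost function might look like the 
Python sketch below. Two details are assumptions, not taken from 
D/Noise: the Shannon entropy is computed on the coefficient energies 
normalized to sum to 1, and the lp cost is the sum of |c|^p (the p-th 
power of the norm). The function name cost is hypothetical.

```python
import math


def cost(coeffs, alpha):
    """Cost of an expansion: Shannon entropy for alpha == 0, else lp."""
    if alpha == 0.0:
        energy = sum(c * c for c in coeffs)
        if energy == 0.0:
            return 0.0
        total = 0.0
        for c in coeffs:
            q = c * c / energy  # normalized energy of this coefficient
            if q > 0.0:
                total -= q * math.log(q)
        return total
    p = 2.0 * alpha  # e.g. alpha = 0.5 gives p = 1
    return sum(abs(c) ** p for c in coeffs)
```

    With alpha = 0.5, the coefficients 3.0 and -4.0 give a cost of 7.0. 
With alpha = 0.0, an expansion with a single nonzero coefficient has 
cost 0, the lowest possible value.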

(5) Entropy ratio
    This real number specifies the threshold mentioned in (3) above.  A 
ratio of 1.0 or higher means that all the entries of expansion of the 
window segment will be considered to be coherent, while a ratio of 0.0 
or less means that the entire signal coming from each window will be 
considered to be noise.  For music, a good testing entropy ratio may be 
between 0.3 and 0.4  if using Shannon entropy (alpha = 0.0 in (4) 
above); 0.7 works well for alpha = 0.5.
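
    One possible reading of the entropy ratio is sketched below in 
Python. It is an assumption made for illustration, not the rule 
D/Noise actually uses: coefficients are taken in order of decreasing 
magnitude and kept as coherent until the kept share of an additive lp 
cost reaches the ratio. The name split_coherent and the parameter p 
are hypothetical. The sketch does reproduce the two limiting cases 
described above.

```python
def split_coherent(coeffs, ratio, p=1.0):
    """Split coefficients into (coherent, noise) lists of equal length.

    Assumption: keep the largest coefficients until their share of the
    total lp cost reaches `ratio`; everything else counts as noise.
    """
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]),
                   reverse=True)
    total = sum(abs(c) ** p for c in coeffs)
    coherent = [0.0] * len(coeffs)
    noise = list(coeffs)
    kept = 0.0
    for i in order:
        if kept >= ratio * total:
            break
        coherent[i] = coeffs[i]  # move this term to the coherent part
        noise[i] = 0.0
        kept += abs(coeffs[i]) ** p
    return coherent, noise
```

    A ratio of 1.0 moves every term to the coherent part, and a ratio 
of 0.0 leaves the whole window in the noise part.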

(6) Time shift
    A specific value k means the signal is padded with k zeros in front, 
the whole program is run, and then the output files are shifted back to 
the left by k samples.  The purpose of different shifts in time is to make 
the signal window cuts occur in different places. It is recommended 
that any shifts chosen be prime or nearly prime numbers, without 
high powers of two occurring in their factorization, and that each shift 
be less than one half of the window size set in (1) above.
As mentioned previously, this version does not yet support any type of 
iteration. If you run the algorithm on the same file specifying a 
different time shift on each run, you will have to average the resulting 
files by hand, i.e. using some audio file mixing utility (a free utility 
package for AIFF files will be included with the next release).
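
The pad, run, and shift-back steps, followed by averaging the runs by 
hand, can be sketched as below. This is an illustration in Python only. 
The denoise argument is a hypothetical stand-in for a full run of the 
program (it must return an output as long as its input), and both 
function names are made up for this example.

```python
def denoise_with_shift(signal, k, denoise):
    """Pad k zeros in front, run `denoise`, then shift back by k."""
    padded = [0.0] * k + list(signal)
    output = denoise(padded)  # hypothetical stand-in for one full run
    return output[k:]


def average_runs(runs):
    """Average several equal-length outputs sample by sample."""
    count = len(runs)
    return [sum(values) / count for values in zip(*runs)]
```

In use, one would call denoise_with_shift once per chosen shift (for 
example 7, 13 and 31) and pass the list of results to average_runs.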

(7) Frequency shifts
    In this field you can enter up to 9 integers, each specifying 
a shift in the frequency domain of the signal. As in (6), the purpose is 
to average out cutting artifacts from the spectrum when performing the 
adapted local trigonometric transform on the signal's frequencies. 
Small primes are recommended; the default presets should suffice.